NSF PAR Search | NSF Public Access Repository

Energy-Based Models for Predicting Mutational Effects on Proteins

https://doi.org/10.1145/3711896.3736931

Soga, Patrick; Lei, Zhenyu; He, Yinhan; Bilodeau, Camille; Li, Jundong (August 2025, ACM)

Free, publicly-accessible full text available August 3, 2026
Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective

Dong, Yushun; Soga, Patrick; He, Yinhan; Wang, Song; Li, Jundong (April 2025, International Conference on Learning Representations)

Graph Neural Networks (GNNs) have achieved remarkable success in various graph-based learning tasks. While their performance is often attributed to the powerful neighborhood aggregation mechanism, recent studies suggest that other components such as non-linear layers may also significantly affect how GNNs process the input graph data in the spectral domain. Such evidence challenges the prevalent opinion that neighborhood aggregation mechanisms dominate the behavioral characteristics of GNNs in the spectral domain. To resolve this conflict, this paper introduces a comprehensive benchmark to measure and evaluate GNNs' capability in capturing and leveraging the information encoded in different frequency components of the input graph data. Specifically, we first conduct an exploratory study demonstrating that GNNs can flexibly yield outputs with diverse frequency components even when certain frequencies are absent or filtered out from the input graph data. We then formulate a novel research problem of measuring and benchmarking the performance of GNNs from a spectral perspective. To take an initial step towards a comprehensive benchmark, we design an evaluation protocol supported by comprehensive theoretical analysis. Finally, we introduce a comprehensive benchmark on real-world datasets, revealing insights that challenge prevalent opinions from a spectral perspective. We believe that our findings will open new avenues for future advancements in this area.
Free, publicly-accessible full text available April 24, 2026
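
The claim in the abstract above, that outputs can contain frequency components absent from the input, can be illustrated with a toy sketch. This is not the paper's benchmark or evaluation protocol. It assumes a cycle graph (whose Laplacian eigenvectors are the ordinary Fourier modes) and uses a plain ReLU as a stand-in for a GNN's non-linear layer; the function names `cycle_graph_spectrum` and `high_frequency_energy` are hypothetical.

```python
import math


def cycle_graph_spectrum(signal):
    """Per-frequency energy of a node signal on a cycle graph C_n.

    On C_n the graph Laplacian's eigenvectors are the Fourier modes, so the
    graph Fourier transform reduces to the ordinary DFT. Entry k is the
    squared magnitude of the DFT coefficient at index k, divided by n.
    """
    n = len(signal)
    energies = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        energies.append((re * re + im * im) / n)
    return energies


def high_frequency_energy(signal, cutoff):
    """Total energy at frequency indices strictly above `cutoff`."""
    return sum(cycle_graph_spectrum(signal)[cutoff + 1:])


n = 32
# Input signal containing only the lowest non-constant frequency (k = 1).
low_pass_input = [math.cos(2 * math.pi * i / n) for i in range(n)]
# ReLU used here as an assumed stand-in for a non-linear GNN layer.
relu_output = [max(0.0, x) for x in low_pass_input]

energy_before = high_frequency_energy(low_pass_input, cutoff=1)
energy_after = high_frequency_energy(relu_output, cutoff=1)
print(f"energy above k=1 before ReLU: {energy_before:.6f}")
print(f"energy above k=1 after ReLU:  {energy_after:.6f}")
```

In this sketch the input has essentially no energy above k = 1, while the rectified signal does, because a half-wave rectified cosine contains higher harmonics. No neighborhood aggregation is involved at any point.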
Deep Interactions for Multimodal Molecular Property Prediction

https://doi.org/10.1007/978-981-96-8173-0_26

Soga, Patrick; Lei, Zhenyu; Bilodeau, Camille; Li, Jundong (June 2025, Springer Nature Singapore)

Multimodal learning that leverages both 2D graph and 3D point cloud information has become a prevalent way to improve model performance in molecular property prediction. However, many recent techniques focus on specific pre-training tasks such as contrastive learning, feature blending, and atom/subgraph masking to learn across modalities, even though the design of the model architecture also affects both pre-training and downstream task performance. Relying on pre-training tasks to align the 2D and 3D modalities omits direct interaction between them, which may be more effective for multimodal learning. In this work, we propose MolInteract, a simple yet effective architecture-focused approach to multimodal molecule learning that addresses these challenges. MolInteract uses an interaction layer to fuse 2D and 3D information and foster cross-modal alignment, showing strong results with even the simplest pre-training methods, such as predicting features of the 3D point cloud and 2D graph. MolInteract outperforms state-of-the-art multimodal pre-training techniques and architectures on various downstream 2D and 3D molecular property prediction benchmark tasks.
Free, publicly-accessible full text available June 10, 2026
Explaining Graph Neural Networks with Large Language Models: A Counterfactual Perspective on Molecule Graphs

https://doi.org/10.18653/v1/2024.findings-emnlp.415

He, Yinhan; Zheng, Zaiyi; Soga, Patrick; Zhu, Yaochen; Dong, Yushun; Li, Jundong (January 2024, Association for Computational Linguistics)

In recent years, Graph Neural Networks (GNNs) have become successful in molecular property prediction tasks such as toxicity analysis. However, due to the black-box nature of GNNs, their outputs can be concerning in high-stakes decision-making scenarios, e.g., drug discovery. In response, Graph Counterfactual Explanation (GCE) has emerged as a promising approach to improve GNN transparency. However, current GCE methods usually fail to take domain-specific knowledge into consideration, which can result in outputs that are not easily comprehensible to humans. To address this challenge, we propose a novel GCE method, LLM-GCE, to unleash the power of large language models (LLMs) in explaining GNNs for molecular property prediction. Specifically, we utilize an autoencoder to generate the counterfactual graph topology from a set of counterfactual text pairs (CTPs) based on an input graph. We also incorporate a CTP dynamic feedback module to mitigate LLM hallucination, which provides intermediate feedback derived from the generated counterfactuals in an attempt to give more faithful guidance. Extensive experiments demonstrate the superior performance of LLM-GCE.
Full Text Available
